Standardizing Quality Assessment of Observational Studies for Decision Making in Health Care

This report summarizes a roundtable discussion of managed care experts and academic pharmacoeconomists, held in March 2008 with the objective of defining action steps to overcome barriers to the incorporation of real-world data into health care decision making. The roundtable meeting was the result of an initial program in July 2007 where managed care decision makers and pharmacoeconomic experts defined these barriers. 1 Real-world data in this context were characterized as data not routinely collected in Phase III drug registration studies, including administrative claims data, patient registries, large simple trials, resource use collection alongside clinical trials, and electronic medical records. 1,2 These data are considered along with standard safety, efficacy, and pricing information. However, one conclusion from the first roundtable discussion was that concerns around the quality assessment of such data have hindered widespread use in decision making.

The Foundation for Managed Care Pharmacy (FMCP) recognized the value of real-world data in its AMCP Format for Formulary Submissions, version 2.1, a structured outline for the presentation of product information by pharmaceutical companies to managed care decision makers. 3 As a sponsor and developer of this standard, FMCP has invested in the adoption of the AMCP Format by manufacturers and health plans. Following a broad communication strategy, more than 50 training seminars on using the AMCP Format have been held for managed care pharmacists. 4 Acceptance of the AMCP Format approach has improved since its inception in 2000; however, its impact on formulary decisions is still in its infancy. A recent survey found that approximately one-third of all pharmacy directors request information from drug manufacturers in a form that is consistent with the AMCP Format. 5 While information delivered in the dossier concerning the safety and efficacy for labeled use was perceived by the health plans to be mostly satisfactory, the information related to off-label use, costs, and benefits was perceived as incomplete, lacking in clarity, and potentially biased. These data indicate that despite extensive efforts, the implementation process has not yet been fully effective in promoting the utilization of real-world data.
The struggle for adoption of new processes in health care delivery is not new. In the United Kingdom (UK), an initiative called PARiHS (Promoting Action on Research Implementation in Health Services) has addressed the importance of process implementation, and in the United States a framework called REP (Replicating Effective Programs) has been established. 6,7 Both present conceptual frameworks and their application for transfer of health care knowledge from research into practice or from one organization to another from the perspective of implementation sciences. Within the circles of implementation science, a theory of "sticky knowledge" has been propagated, referring to the inherent resistance of the old process against the new. This leads to inefficient knowledge transfer as a significant barrier to implementation of new processes into the health care environment. 6-9 These publications highlight that a structured process must support the effective introduction of an innovation. This process consists of actions before the actual change (learning before doing) and actions after the first day of usage (learning by doing). The process involves the source of the knowledge, the recipient of the knowledge, and the environment for the change, and it should be guided by personal facilitation. As related to the incorporation of real-world data into formulary decision making, the creation of a process or technique for using real-world data is only a starting point. Implementation requires a multistep process, to be carefully planned and executed.
Thus, while the 2007 meeting of managed care experts and academic pharmacoeconomists recognized the importance of real-world data, identified the potential barriers, and recommended methodological approaches to overcome those barriers, the issue of process implementation was not fully addressed. The methods and the planning of process implementation were targeted in 2008, when the participants focused on (a) integration of currently available tools for quality assessment of real-world data studies into 1 standard instrument; (b) creation of an implementation process for the dissemination of the instrument, including its validation by peer groups; and (c) establishment of a training certificate program to educate the potential users of such a tool. Here, we report the progress of the work and the resulting integrated action plan to support the developing body of research on this topic.

Methods
Prior to the roundtable event, participants were assigned to the following 4 workgroups: (1) Next steps for the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Task Force on Real-World Data; (2) Real-world data assessment instruments for researchers and users of research; (3) Acceptance process for an assessment instrument; and (4) Training and education on the use of an instrument. The roundtable program was structured in sequential steps, including preparatory workgroup activities, discussion and assessment at the roundtable, and work-up in the workgroups. Finally, all recommendations were collected, refined at the roundtable, and prioritized to yield an integrated action plan.

Results
The discussions are presented in the sequence of the 4 workgroups including the findings of the preparatory phase and the recommendations formed during the course of the workshop.

Workgroup 1: Continuation of the Work of the ISPOR Task Force on the Use of Real-World Data.
In response to the increasing request by decision makers for real-world outcomes information, 2 ISPOR created a Task Force on Real-World Data to develop a framework to assist health care decision makers in understanding real-world data in the context of application to reimbursement decisions. 2 The open comment period yielded over 70 comments from ISPOR members and these comments were summarized by the workgroup. The following core needs were identified: 2,10 (a) more guidance on defining a "Hierarchy of Evidence" that incorporates real-world data; and (b) guidance on how real-world data should be applied to reimbursement decisions, perhaps through guidelines or case studies. These key requests validated the overall premise of the roundtable discussion by expressing a strong need for specific guidance on interpreting real-world data.
Various systems have been suggested to grade the quality of evidence for clinical decision making. 2,11-19 In a nutshell, the criteria for grading use either technical criteria to evaluate strength of research design and internal validity or the "net benefit" criterion (e.g., the magnitude of benefit compared to trade-offs). Discussion is ongoing about whether it is possible, or useful, to determine a "fixed" system for defining a hierarchy of evidence that can accommodate both criteria. 14 Critics fear that hierarchies of evidence are over-simplistic, pseudo-quantitative approaches to replace informed judgment. 20 Proponents state that such instruments will help producers and users of these data to orient themselves in a standard framework. The recommendation of the workgroup was that organizations vested in the creation and utilization of such evidence, such as ISPOR and AMCP, should evaluate the state of the art in developing and using "evidence hierarchies" and the feasibility of adopting a standard, more inclusive, hierarchy system.
In order to substantiate the potential for using real-world data for reimbursement decisions, the group analyzed 4 case studies where evidence based on real-world data was used to make a decision involving conditional reimbursement. The first example was the conditional reimbursement for a targeted multiple sclerosis therapy of interferon and glatiramer acetate. The agreement between the UK authorities and the manufacturer was that drug costs would be reduced if a cost-effectiveness threshold of £36,000 per quality-adjusted life year over 20 years was exceeded based on an ongoing collection of real-world data. 21 The second example was the reimbursement scheme for bortezomib (Velcade) in the UK. The manufacturer agreed to rebate the full cost of the drug if the patient did not achieve at least a partial response to the treatment after a maximum of 4 treatment cycles as measured by a marker protein. 22 In the third case study, the reimbursement of the Oncotype Dx, a test for the identification of recurrence risk for women with early stage breast cancer, was deemed acceptable by the health insurer United Healthcare for 18 months if the test led to reduced volume of chemotherapy in those women who had been identified as low risk. 23 The final example, "Coverage with Evidence Development," has been suggested by the Centers for Medicare & Medicaid Services (CMS), whereby reimbursement is given for a limited time on the condition that evidence of effectiveness and/or safety of treatment is collected, often as part of real-world trials. 24,25 Agreements like the ones outlined above may allow additional data collection for new therapeutic interventions while limiting the risk for the payer. Conditions for reimbursement become linked to real-world effectiveness, as determined by the evaluation of real-world data.
For such agreements it is important to use clear definitions of the type of data used, the quality of data requested, and the cut-off points for the reimbursement, and to agree on a manageable volume or length for data collection. In a subsequent step, further decisions can then be made based on improved real-world evidence on the treatment effectiveness.
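The role of a clearly defined cut-off point can be made concrete with a small sketch. The Python below is illustrative only and is not taken from any of the agreements described: the function names, the input figures, and the rule that costs fall just far enough to meet the threshold are assumptions. Only the £36,000 per quality-adjusted life year (QALY) cut-off comes from the first case study.

```python
# Illustrative sketch only: names, inputs, and the adjustment rule are
# hypothetical. The 36,000 GBP per QALY cut-off is the one cited in the text.

THRESHOLD_GBP_PER_QALY = 36_000.0


def cost_per_qaly(incremental_cost, incremental_qalys):
    """Incremental cost divided by incremental QALYs gained."""
    if incremental_qalys <= 0:
        raise ValueError("incremental_qalys must be positive")
    return incremental_cost / incremental_qalys


def required_cost_reduction(incremental_cost, incremental_qalys,
                            threshold=THRESHOLD_GBP_PER_QALY):
    """Reduction in incremental cost needed to meet the threshold.

    Returns 0.0 when the observed cost per QALY is already at or
    below the threshold (assumed rule, for illustration).
    """
    ratio = cost_per_qaly(incremental_cost, incremental_qalys)
    if ratio <= threshold:
        return 0.0
    return incremental_cost - threshold * incremental_qalys


# Hypothetical real-world findings: 50,000 GBP extra cost for 1.25 QALYs.
observed_ratio = cost_per_qaly(50_000.0, 1.25)        # 40000.0
reduction = required_cost_reduction(50_000.0, 1.25)   # 5000.0
print(observed_ratio, reduction)
```

In this hypothetical case the observed cost per QALY exceeds the cut-off, so a cost reduction would be triggered. An actual agreement would also need to specify the data source, the time horizon, and how uncertainty in the estimates is handled.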
To avoid biased reporting of the results of observational studies and to make sure that all existing evidence is accessible to decision makers, Workgroup 1 suggested a voluntary registry of observational studies, which should lead to increased credibility of these studies and the organizations conducting these studies. Transparency is important for models and real-world evidence. However, the degree of transparency depends, among other factors, on the level of expertise of the user of this information. To allow more standardized qualifications of models or real-world data studies, Workgroup 1 proposed the establishment of an independent body or review process, which could be formed as a consortium of experts giving access to a broad range of resources and expertise for an audit, review or quality certification process. Such a body could, for example, be associated with an organization like FMCP as a service to the AMCP membership.

Workgroup 2: Assessment Tool for Researchers and Users of Research.
The 4 managed care organization participants reviewed how instruments for assessing the quality of real-world data studies have been used in their organizations. None currently used a published instrument for assessing the quality of real-world data studies, or had done so in the past. In order to be considered, such an instrument would need to be short, simple to use, concise, and validated. In addition, recommendation by a national organization would be seen as a positive factor. With these objectives in mind, Workgroup 2 evaluated existing instruments for application towards the assessment of real-world data studies. The goal would be to consolidate these instruments into a useful tool for decision makers to evaluate real-world data.

[Table 1 excerpt: Examining the value and quality of health economic analyses: implications of utilizing the QHES. To create a quantitative approach to the appraisal of health economic studies.]
Individual members of the workgroup nominated various instruments for inclusion in the evaluation. The resulting 13 published assessment tools were reviewed for suitability to critique real-world data studies. 26-36,38 Table 1 summarizes the evaluation tools included in this work. Prior to the meeting, members of the workgroup assessed the tools according to (a) target user (researcher, journal reviewer, systematic reviewer, decision maker); (b) study type (prospective, retrospective, modeling); (c) phase of research (planning, publishing, using); (d) type and number of domains used in the assessment tool; (e) grading system used; (f) strengths and weaknesses; and (g) geographic transferability. The perspectives of evaluation tools can be either from the viewpoint of the researcher (assurance of research quality), of the reviewer (assurance of research and publication quality), or of the decision maker (assurance of evidence quality and relevance). The instruments evaluated had been developed for a variety of study types, including modeling, 26,29-31,33,35 nonrandomized studies, 27 budget impact analysis, 30 retrospective studies, 26,27,31,32,34 or studies of compliance and persistence. 32 The number of questions or criteria assessed varied considerably across the instruments. The mean number of criteria was 30.5 with a range of 10 to 69 questions. The structure of the questions was either by publication outline (objectives, background, methods, results, discussion, and limitations) or by methods (comparators, bias, and sensitivity analysis). Only 2 of the instruments indicated that they had been validated, 27,36 whereby a validation process involves determination of the congruency of independent assessments of the same research report by several evaluators using the instrument to be validated. For the majority of instruments, no updates were available.
In general, yes-or-no questions were used in the evaluation as opposed to a scale. Although our review was not intended to be comprehensive of every real-world data assessment tool available, this evaluation of a selection of 13 assessment tools confirms the findings of a previous study performed by the Agency for Healthcare Research and Quality (AHRQ) in 2002. There are a variety of checklists, which may lead to different assessment results, and it is up to the user to select the most suitable tool. 39 This workgroup recommended the collection of all instruments available and classification according to their usefulness and impact on decisions from the viewpoint of a decision maker. The overriding objective of this work would be the development of 1 consolidated instrument in the form of a modular assessment tool with different axes by (a) study objective: economic impact or cost-effectiveness, health outcomes, patient-reported outcomes; and (b) study type: model, clinical prospective study, or retrospective data analysis. Workgroup 2 also acknowledged the importance of validating a consolidated instrument for quality assessment of real-world data by a broader group of users.
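Validation, described above as the congruency of independent assessments of the same research report, could be quantified in several ways. The sketch below uses Cohen's kappa for 2 evaluators answering the same yes-or-no checklist items. The choice of statistic, the function name, and the example ratings are assumptions made for illustration; this report does not name a particular agreement measure.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between 2 raters on the same items.

    Each argument is a sequence of categorical answers (for example
    True/False for yes-or-no checklist items), in the same item order.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty item list")
    n = len(rater_a)

    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's answer frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters gave one identical answer throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers of 2 evaluators to a 10-item yes-or-no checklist.
evaluator_1 = [True, True, False, True, False, True, True, False, True, True]
evaluator_2 = [True, True, False, False, False, True, True, True, True, True]
print(round(cohens_kappa(evaluator_1, evaluator_2), 3))  # 0.524
```

With more than 2 evaluators, or with graded rather than yes-or-no answers, a different statistic would be needed; the sketch covers only the simplest case.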
Consolidation does not necessarily mean the creation of a new additional instrument. Instead, it involves a thorough comparison of the existing instruments and selection of those criteria that are recognized as key indicators of the required quality assessment. In the September 2008 issue of JMCP, Fairman and Curtiss reported a comparison of a range of research and publication guidelines. 40 A similar comparison, with a focus on real-world database research, would form the foundation of a consolidated tool to quickly link assessment to the quality evaluation at hand. Consequently, the first step for a consolidated tool would be the development of a comprehensive list of criteria, which define the quality standard expectations by type of quality assessment need. However, such a comprehensive list must be balanced against a key request from the user perspective, which was to keep it short and simple. There are 2 different approaches to deal with this tension. To create a "user-friendly" tool, the items could be ranked to identify the 10 most important factors from the decision maker's perspective. Alternatively, an independent body or review process could decrease the skepticism of decision makers towards real-world data studies, without increasing the complexity in the individual decision-making process.

Workgroup 3: Process to Achieve Dissemination and Acceptance of an Assessment Tool.
One conclusion reached in the Fairman and Curtiss review of guidelines was that ample guidance exists, but that its use is limited. 40 Thus, guidelines are not widely adopted, and their existence does not guarantee a minimum quality standard of published research. Due to the demonstrated challenges regarding the uptake of tools such as the AMCP Format for Formulary Submissions 5 and the implementation of evidence-based decisions in health care, 6-8,41 Workgroup 3 was assigned to outline a process for high-level agreement on the acceptance of an instrument for quality assessment to ensure widespread adoption of the instrument.

[Table 1. Summary of Checklists Evaluated as Basis for the Roundtable Discussions. Excerpt (1998): To test the feasibility of creating a valid and reliable checklist with the following features: appropriate for assessing both randomized and nonrandomized studies; providing both an overall score for study quality and a profile of scores not only for the quality of reporting, internal validity (bias and confounding) and power, but also for external validity.]
The initial recommendation was to establish an expanded, multidisciplinary advisory board composed of key decision makers and academic pharmacoeconomic researchers. The primary goal of this group would be to advance the content from the 3 workgroups for presentation to a larger, public "user forum" including professional bodies of decision makers and researchers, clinicians, employers, and patient or quality assurance organizations. Potential stakeholders represented in such a forum could be the National Committee for Quality Assurance (NCQA), the AHRQ, CMS, Congressional Budget Office, Wellpoint, Blue Cross Blue Shield, Kaiser, state Medicaid agencies, Department of Veterans Affairs, University HealthSystem Consortium (UHC), Institute of Medicine, World Association of Medical Editors, and the recently formed Pharmacy Quality Alliance. The objective would be to increase public awareness of the need for quality assessment of observational evidence and, subsequently, the acceptance of observational studies meeting defined quality standards to be used in the decision-making process.
An alternative recommendation to a grass-roots initiative was the involvement of an independent body or review process for the "Quality Assessment of Real-World Information." Such an independent body could evaluate existing real-world data information, based on the assessment tool, and provide recommendations concerning the use of such information back to the users, such as decision makers. Such an institution could also be involved in the maintenance of the tools and in long-term studies on the accuracy of the predictions drawn from real-world data when executed within a health plan system, similar to case studies presented by Workgroup 1.
For either recommendation an important part of the process would be input into the dissemination and adoption of a consolidated instrument and a training platform for all stakeholders who may use the instrument.

Workgroup 4: Training and Education.
The objective of Workgroup 4 was to outline an education process to communicate the efficient and competent use of the assessment tool to all stakeholders. The goal of this initiative was to make training available to all potential users, including researchers, evaluators, journal editors, managed care representatives, physicians, and patient organizations. The workgroup determined that any training program would have to assume that participants start from a broad range of pre-existing knowledge.
An inventory was taken of existing educational resources of related professional societies. There are several sources for education on the use of pharmacoeconomic evidence in formulary decision making provided by ISPOR and AMCP/FMCP, including workshops, short courses, live seminars, and program content from annual meetings, which can be retrieved through the respective internet sites (www.amcp.org and www.ispor.org). The American College of Clinical Pharmacy (ACCP) has a training module on the use of pharmacoeconomics and outcomes research in patient care, 42 as well as guidelines for pharmacoeconomic fellowship training. 43 Other organizations, such as Health Technology Assessment international, Society for Medical Decision Making, and AcademyHealth, offer links to existing training programs but have no internal programs of their own. Workgroup 4 recommended the creation of a training certificate program on evaluating real-world data studies. The inventoried coursework could be provided through the Web as introductory courses, and advanced courses tying this work to the application of real-world data in formulary decision making could be conducted face-to-face. The certificate program could then be supplemented by an ongoing mentoring option. The program was envisioned as a set of sequential modules. Once all the Web-based modules have been successfully completed, the participant qualifies for an interactive live program, or advanced course, on quality assurance for real-world studies. For the face-to-face advanced programs, a "speaker's bureau" with qualified trainers was recommended. Training could be offered by the associations to their members, or organizations or companies could hire trainers to train their employees. After the live advanced course, the participants could then enroll in an ongoing mentoring program to facilitate the uptake of the methodology in the participant's work routines.
This mentoring program was envisioned as a "mentor's bureau" formed by the active users of the tool in their decision-making process. The reason for suggesting such a mentoring system is the experience that it is often difficult to transfer a newly acquired process or methodology into daily practice. While the newly learned process and method may seem to be clear in the classroom situation, obstacles often appear only in the practical application. A mentoring system may be a faster way of overcoming the obstacles and avoiding frustrations in the application. Conversely, a mentoring system could also help to bring the typical problems in the application back to the development team.
The development of the program should be financed by sources independent from manufacturers of products to be decided on. The recently passed economic stimulus package may provide government-based funding options through the AHRQ. The ongoing delivery of the training should be self-sustaining and financed by fees for participation and certification. A mandatory certificate for decision makers of leading organizations would fundamentally increase the utilization and adoption of the instrument and the standardization of the process.

Integrated Results of the 4 Workgroups.
Workgroup 1 defined the current status around the need for real-world evidence through examples of its use in drug reimbursement decisions. Figure 1 depicts the overall process, which was elaborated by the 4 workgroups with the goal to improve the utilization of real-world data by decision makers. The core of this approach was to create a standardized instrument for quality assessment, which was led by Workgroup 2; then to develop a process for uptake and dissemination as outlined by Workgroup 3; and finally to support the establishment of this approach with a certified training series defined by Workgroup 4. The roundtable participants could serve as a steering committee for these activities and pursue funding procurement. A larger body, to include interaction with concerned stakeholders, could be considered an "Interdisciplinary Board for the Quality Assurance of Real-World Data."

Discussion
The goal of the current roundtable discussions was to suggest processes through which to improve the quality and acceptance of studies based on real-world data to be used for decision making. The preceding event in 2007 had identified hurdles for the integration of such studies into the decision-making process and led to the need to begin formulation of a joint action plan developed by formulary decision makers from managed care and pharmacoeconomic research experts on how to overcome these obstacles. 1 The process recommended by the second roundtable includes: (a) the establishment of a standard quality assessment tool by consolidating previously suggested tools; (b) pilot testing and subsequent validation of the robustness of the instrument in a larger user forum; and (c) the communication of the tool through publications and workshops. In addition, the creation of an oversight board for safeguarding credibility and dissemination was recommended along with a multistage training program with access to a growing pool of users of the instrument. Quality assessment using an assessment tool or process is expected to help diminish the research-to-practice gap 44 with respect to real-world data.
The process suggested in this roundtable has been formally described as "knowledge transfer," 45 defined as the implementation of knowledge by key stakeholders with the intention of improving health outcomes and efficiencies of the health care system. Change in health care does not happen easily, even if there is hard evidence for the advantages of the new direction or intervention. 6,7,44 The existence of quality processes such as the AMCP Format or quality assessment instruments as they were discussed during the roundtable event does not lead to their automatic adoption. 5,40 Hurdles along the way can be located on the individual, intra-organizational, or inter-organizational level. Insufficient financial, intellectual, or structural resources can limit acceptance of the new intervention. Those who should change often fall back on the familiar way of doing things despite evidence in support of the new ways. 5 Change has to be carefully planned and facilitated throughout both pre- and post-implementation phases. How the intervention is packaged, training, technical assistance, and fidelity assessment are reported to be crucial to the successful implementation of effective interventions in health care. 6 However, the suggested process and tool are only first steps to improve the utilization and outcomes of decision making with real-world data. An iterative improvement process is mandatory. For example, the AMCP Format, which was originally published in the year 2000, was revised in 2002 (version 2.0) and 2005 (version 2.1), and further revisions will follow. 3 In addition to the expert groups helping to keep the AMCP Format up-to-date, surveys and studies have been conducted among the target audience on the utilization and usefulness of the document. 5,46-48 This report only considers the validation of the instrument for consistency of the results when used by different users.
The important question, whether the inclusion of high-quality real-world data and a standardization of the process of including these data in decision making will improve the decisions and their impact on health outcomes, is not addressed. In May 2006, an international collaborative initiative called the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network was launched in London. The goal of this initiative is "to enhance reliability of medical research literature by promoting transparent and accurate reporting of health research." 49 Assuring the quality of medical research literature is highly synergistic with the goals of the ideas discussed in the roundtable reported here. If such standards are used more consistently, they may facilitate the task of decision makers to assess the impact of research results on the health of their membership.
Alternatives to the process suggested here were discussed throughout the roundtable session, such as the creation of a quality certification body or review process, which would as an intermediary certify the quality of real-world data and evidence and interpret the potential impact on health outcomes and budgets. The disadvantage of such an institution, in addition to the organizational and operational issues, would be that it would introduce an additional interpretation level into the process instead of increasing the general level of knowledge and experience within the organizations.
One key limitation of this discussion is the presumption that in most organizations a structured decision-making process exists and the issue is only how to incorporate assessment of real-world data into this process. This may not be the case for all health plans, pharmaceutical benefit management companies, or other organizations making decisions on drug formularies. However, following the suggested pathway will not only increase standardization of quality assessment of real-world data, but at the same time it will also direct some attention to the decision-making process in general and allow for a growing exchange among those involved in the drug formulary decision-making process.

Conclusions and Recommendations
The roundtable discussion between formulary decision makers of managed care and pharmacoeconomic academic experts led to a multistep action plan to increase the utilization of real-world data in decision making (Table 2). The proposed process should involve all relevant stakeholders in the development, testing, validation, and dissemination of a consolidated quality assessment instrument. To increase the general level of qualification for users of real-world data among decision makers, a multilevel certificate-training program has been recommended. To facilitate the integration of the instrument into the user's organizational procedures, a mentoring program for all graduates of the training was proposed.

Table 2
• Formation of "Interdisciplinary Board for the Quality Assurance of Real-World Data"
• Analysis of existing instruments for quality assessment of real-world data
• Prioritization of assessment criteria depending on decision context
• Consolidation of assessment tool
• Validation of tool in small user group, improvement where needed
• Pilot of tool in larger user group
• Development of certificate training series (Web-based training modules and face-to-face training)
• Roll out of instrument and training program
• Formation of mentoring group and system
• Ongoing support and guidance through "Interdisciplinary Board for the Quality Assurance of Real-World Data"